Background

Quantification of edema and scar maps with cardiac MR images (cMRIs)
enables effective Radiofrequency Ablation (RFA) of arrhythmias during the
Electrophysiology (EP) procedure [1]. This is a major advantage over EP
catheterization under X-ray and ultrasound guidance. High-contrast,
high-resolution cMRIs can be obtained preoperatively as an EP roadmap for
surgical planning of RFA, whilst real-time MRI (rt-MRI) can be used to
guide catheterization and update the cMRI model [2], providing
intraoperative visualization of a 3D vascular map. This requires a fast
and efficient technique for non-rigid image co-registration. Although
feature-based registration methods can be processed rapidly by computing
sparse features, their outcome is sensitive to blurred images with
artifacts, which occur regularly in low-resolution rt-MRI and cause
significant errors in feature detection. We hypothesized that, with the
use of a Field-programmable Gate Array (FPGA), a novel data structure and
memory-access architecture can allow robust registration based on
comparison of image intensity patterns, thus fulfilling the real-time
requirements of clinical practice.



Methods

Acquiring the image gradient is a common step in intensity-based
registration methods [3] (e.g. Demons [4]), but it is also the primary
computational bottleneck. Image gradient computation requires information
from the pixel/voxel neighborhood, leading to a large number of
non-coalesced memory accesses and floating-point operations. A customized
FPGA-based computation kernel of Demons is proposed. Multiple pixel/voxel
processing units (PUs) are placed in the FPGA, each with its own
pixel/voxel memory. Input pixels/voxels are processed as a data stream
that propagates through the kernel. The workloads are then distributed to
the PUs such that neighboring gradients are connected by neighboring PUs;
hence memory bandwidth is further reduced. Rapid computation of image
registration is achieved by 1) the highly customized PUs; 2) the
parallelism of multiple PUs and pixel/voxel memories; and 3) bandwidth
reduction through inter-PU information exchange channels.

Results

Figure 1 shows Demons results for 2D cMRIs (Gradient Echo). Figure 2a
shows a robust registration, even given the poor-quality intraoperative
image with motion artifacts. The 3D Demons was applied to the
corresponding images in 3D. An FPGA (Xilinx® Virtex7-XC7V2000T) was used
to investigate the accelerated performance. Figure 2b depicts the
computational time required for the 3D images at various levels of
resolution, which is > 40 times faster than the state-of-the-art
acceleration techniques [3,4].



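As a point of reference for the gradient computation that the Methods
identify as the bottleneck, the following is a minimal pure-Python sketch
of one original (Thirion-style) Demons force evaluation in 2D. It is not
the authors' FPGA kernel. The function names (central_gradient,
demons_force), the clamped borders and the eps guard are assumptions made
for illustration, and the sign convention of the force varies between
papers.

```python
def central_gradient(img, x, y):
    """Finite-difference gradient at (x, y).

    Interior pixels use a central difference; at the borders the
    neighbor index is clamped, so the difference becomes one-sided
    (and is still divided by 2).
    """
    h, w = len(img), len(img[0])
    gx = (img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
    gy = (img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
    return gx, gy


def demons_force(fixed, moving, eps=1e-9):
    """Per-pixel update (ux, uy) for one Demons iteration.

    Assumed form: u = (m - f) * grad(f) / (|grad(f)|^2 + (m - f)^2),
    with f the fixed image and m the (already warped) moving image.
    Pixels whose denominator is below eps get a zero update.
    """
    h, w = len(fixed), len(fixed[0])
    ux = [[0.0] * w for _ in range(h)]
    uy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Each pixel needs its four neighbors in the fixed image.
            gx, gy = central_gradient(fixed, x, y)
            diff = moving[y][x] - fixed[y][x]
            denom = gx * gx + gy * gy + diff * diff
            if denom > eps:
                ux[y][x] = diff * gx / denom
                uy[y][x] = diff * gy / denom
    return ux, uy


if __name__ == "__main__":
    # Tiny illustrative input: a horizontal intensity ramp.
    fixed = [[0.0, 1.0, 2.0, 3.0] for _ in range(3)]
    moving = [row[:] for row in fixed]
    moving[1][1] = 2.0  # perturb one interior pixel
    ux, uy = demons_force(fixed, moving)
    print(ux[1][1], uy[1][1])
```

Every output pixel reads its neighbors in the fixed image, which is the
neighborhood access pattern that a streaming design with several PUs and
local memories would aim to serve without repeated external reads. A full
registration would additionally smooth the update field and re-warp the
moving image on each iteration; those steps are omitted here.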
^College of Engineering, University of Georgia, Athens, Georgia, USA 
Full list of author information is available at the end of the article 



Conclusions 

The performance of the proposed computing architecture demonstrates its
high potential for accelerating registration of 3D-gated MRI images to
improve visualization in MRI-guided cardiac therapy.

Funding 

NIH U41-RR019703, R43 HL110427-01, AHA 
10SDG261039, EPSRC and Croucher Foundation 
Fellowship. 






© 2014 Kwok et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.



Kwok et al. Journal of Cardiovascular Magnetic
Resonance 2014, 16(Suppl 1):W11
http://www.jcmr-online.com/content/16/S1/W11



Figure 1 The deformed images co-registered between the intra-operative images (with scanning resolution: 66p*192s) and the
pre-operative images (with 132p*192s). Both are on the same plane. Each Demons trial took less than 10 ms with the use of the
proposed computing architecture.
Figure 2 (a) The intra-operative image corrupted by motion artifacts (indicated by the red arrow). The deformed image was transformed
by the grid applied to the pre-operative image; (b) Computational time of FPGA-based 3D Demons processed in single and double precision,
compared with the graphics processing unit (GPU)-based Demons reported in [3] and [4]. Around 100 iterations were required to complete the
Demons trials. Only [4] adopted the original Demons force computation, which involves fewer gradient operations.